Linguistic mechanism of the evolution of amino acid frequencies and genomic GC content
Abstract
Much information is stored in the amino acid composition of proteins and the base composition of DNA. We simulated the evolution of amino acid frequencies and genomic GC content with a linguistic model. The results show that the evolution of the genetic code determines the evolution of amino acid frequencies and genomic GC content. We explain the relationships among amino acid frequencies, genomic GC content and protein length distribution in a unified theoretical framework. In particular, the simulated evolution of amino acid frequencies and of GC content at each codon position agrees closely with results derived from the data of all genomes known so far. Furthermore, we found that the space spanned by the average protein length in a proteome and the ratio of amino acid frequencies is useful for describing phylogeny and evolution. Strikingly, the points representing all species in this space form an evolutionary flow. We believe that amino acid gain and loss is driven by the established pattern of variation in amino acid frequencies. The linguistic mechanism helps to unveil the origin of the genetic code.
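The quantities the abstract compares (genomic GC content, GC content at each codon position, and amino acid frequencies) can be computed directly from sequence data. The following is a minimal Python sketch of those measurements only, not the authors' linguistic model. The function names and the toy sequences are hypothetical, and the code assumes an in-frame coding sequence written in the plain A/C/G/T alphabet.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def codon_position_gc(cds):
    """GC content at the 1st, 2nd and 3rd codon positions of an in-frame CDS.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return tuple(gc_content(cds[pos:usable:3]) for pos in range(3))


def amino_acid_frequencies(protein):
    """Relative frequency of each residue in a one-letter protein sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Toy example (hypothetical data): three codons, ATG GCG TTC.
toy_cds = "ATGGCGTTC"
overall_gc = gc_content(toy_cds)
gc1, gc2, gc3 = codon_position_gc(toy_cds)
toy_freqs = amino_acid_frequencies("MAAK")
```

For a whole proteome, the same functions would be applied to the concatenated coding sequences and protein sequences of a species, giving one point per species for comparisons across genomes.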
Similar resources
Holographic bound and protein linguistics
The holographic bound in physics constrains the complexity of life. The finite storage capability of information in the observable universe requires the protein linguistics in the evolution of life. We find that the evolution of genetic code determines the variance of amino acid frequencies and genomic GC content among species. The elegant linguistic mechanism is confirmed by the experimental o...
Analysis of the Relationship between Genomic GC Content and Patterns of Base Usage, Codon Usage and Amino Acid Usage in Prokaryotes: Similar GC Content Adopts Similar Compositional Frequencies Regardless of the Phylogenetic Lineages
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species....
Amino Acid Usage Is Asymmetrically Biased in AT- and GC-Rich Microbial Genomes
INTRODUCTION Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affec...
Across Bacterial Phyla, Distantly-Related Genomes with Similar Genomic GC Content Have Similar Patterns of Amino Acid Usage
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino ...
GC-Content of Synonymous Codons Profoundly Influences Amino Acid Usage
Amino acids typically are encoded by multiple synonymous codons that are not used with the same frequency. Codon usage bias has drawn considerable attention, and several explanations have been offered, including variation in GC-content between species. Focusing on a simple parameter-combined GC proportion of all the synonymous codons for a particular amino acid, termed GCsyn-we try to deepen ou...